graph TD
subgraph TextParsing["Text-Parsed Tool Use"]
A1["LLM generates text:<br/>Action: search<br/>Input: 'Paris'"] --> A2["Regex parser<br/>extracts tool + args"] --> A3["Execute tool"]
end
subgraph FunctionCalling["Structured Function Calling"]
B1["LLM generates<br/>structured JSON tool_call"] --> B2["SDK parses<br/>tool_call object"] --> B3["Execute tool"]
end
subgraph MCP["Model Context Protocol"]
C1["Agent discovers tools<br/>via tools/list"] --> C2["LLM selects tool<br/>+ structured args"] --> C3["Execute via<br/>tools/call JSON-RPC"]
end
TextParsing ~~~ FunctionCalling
FunctionCalling ~~~ MCP
style TextParsing fill:#F2F2F2,stroke:#D9D9D9
style FunctionCalling fill:#F2F2F2,stroke:#D9D9D9
style MCP fill:#F2F2F2,stroke:#D9D9D9
style A1 fill:#e74c3c,color:#fff,stroke:#333
style B1 fill:#4a90d9,color:#fff,stroke:#333
style C1 fill:#27ae60,color:#fff,stroke:#333
Tool Use and Function Calling for Retrieval Agents
From OpenAI function calling to MCP — building dynamic tool selection for SQL, API, and vector search retrieval
Keywords: tool use, function calling, MCP, Model Context Protocol, OpenAI tools, Anthropic tool use, LangChain tools, LlamaIndex FunctionTool, dynamic tool selection, SQL tool, API tool, vector search, tool registry, agent tools, retrieval agent, tool schema

Introduction
An agent is only as capable as the tools it can call. The ReAct loop gives agents the ability to reason and act, but tool use is the mechanism that connects reasoning to the outside world — databases, APIs, vector stores, file systems, and any other data source an agent needs to answer questions.
The landscape of tool use has evolved rapidly:
- 2023: OpenAI introduced function calling — structured JSON schemas that let models emit tool calls instead of free text
- 2024: Anthropic shipped tool use with input_schema/tool_result content blocks, and LLM frameworks unified tool definitions across providers
- 2025: The Model Context Protocol (MCP) emerged as an open standard for tool discovery and execution — a universal “USB-C port” for connecting AI applications to external systems
This article covers the full stack of tool use for retrieval agents — from provider-level function calling APIs, through framework-level tool abstractions in LangChain and LlamaIndex, to the protocol-level standardization of MCP. We build concrete retrieval tools for SQL databases, REST APIs, and vector stores, then wire them into agents that dynamically select the right tool for each sub-question.
From Text Parsing to Structured Tool Calls
The Evolution of Tool Use
Early agents relied on the LLM producing text in a specific format (Action: tool_name) that was regex-parsed by the host. This worked but was fragile — a single formatting mistake would break the tool call entirely.
Modern function calling moves the structure into the API itself:
Why Structured Tool Calls Matter
| Aspect | Text Parsing | Structured Function Calling |
|---|---|---|
| Reliability | Fragile — regex breaks on format variations | Robust — guaranteed JSON schema conformance |
| Parallel calls | One tool per step | Multiple tools in a single response |
| Type safety | String-only arguments | Typed parameters (string, integer, array, object) |
| Validation | Manual | Automatic schema validation |
| Model support | Any LLM | Requires function-calling support |
| Debugging | Parse the Action: / Input: text | Inspect structured tool_calls objects |
Bottom line: Use structured function calling for production agents with supported models. Fall back to text parsing only when working with open-source models that lack function-calling support.
OpenAI Function Calling
OpenAI’s function calling API lets you describe tools as JSON Schema objects. The model decides when to call a tool and emits a structured tool_calls response instead of free text.
Defining Tool Schemas
Each tool is a JSON Schema with a name, description, and parameter specification:
TOOL_SCHEMAS = [
{
"type": "function",
"function": {
"name": "search_knowledge_base",
"description": "Search the internal knowledge base for product documentation, "
"API references, and how-to guides.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Natural language search query",
},
"top_k": {
"type": "integer",
"description": "Number of results to return (default: 5)",
},
},
"required": ["query"],
},
},
},
{
"type": "function",
"function": {
"name": "query_database",
"description": "Execute a read-only SQL query against the analytics database. "
"Use this for structured data questions like counts, averages, and filters.",
"parameters": {
"type": "object",
"properties": {
"sql": {
"type": "string",
"description": "SQL SELECT query to execute",
},
},
"required": ["sql"],
},
},
},
{
"type": "function",
"function": {
"name": "call_rest_api",
"description": "Call an external REST API endpoint to fetch real-time data "
"like weather, stock prices, or service status.",
"parameters": {
"type": "object",
"properties": {
"endpoint": {
"type": "string",
"description": "API endpoint path (e.g., '/users/123')",
},
"method": {
"type": "string",
"enum": ["GET", "POST"],
"description": "HTTP method (default: GET)",
},
},
"required": ["endpoint"],
},
},
},
]
The Function Calling Loop
The agent loop sends tool schemas to the model, receives structured tool calls, executes them, and feeds back the results:
import json
from openai import OpenAI
client = OpenAI()
def run_tool_calling_agent(query: str, tools: dict, schemas: list, max_steps: int = 8) -> str:
"""Agent loop using OpenAI function calling."""
messages = [
{"role": "system", "content": "You are a helpful retrieval agent. "
"Use the provided tools to answer questions accurately. "
"Always verify facts with tools before answering."},
{"role": "user", "content": query},
]
for step in range(max_steps):
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=schemas,
tool_choice="auto",
temperature=0,
)
msg = response.choices[0].message
messages.append(msg)
# No tool calls means the model is ready to answer
if not msg.tool_calls:
return msg.content
# Execute each tool call (supports parallel calls)
for tool_call in msg.tool_calls:
name = tool_call.function.name
args = json.loads(tool_call.function.arguments)
if name in tools:
result = tools[name](**args)
else:
result = f"Error: Unknown tool '{name}'"
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": str(result),
})
return "Agent reached maximum steps without producing a final answer."
Parallel Tool Calls
A key advantage of structured function calling: the model can emit multiple tool calls in a single response. For example, when asked “What’s the weather in Paris and London?”, the model returns two tool calls simultaneously:
# The model may return multiple tool_calls in one response:
# msg.tool_calls = [
# ToolCall(id="call_1", function=Function(name="get_weather", arguments='{"city":"Paris"}')),
# ToolCall(id="call_2", function=Function(name="get_weather", arguments='{"city":"London"}')),
# ]
# Both are executed and their results returned before the next LLM call.
This reduces latency by eliminating sequential round-trips — the agent gets both results in one step instead of two.
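Because the tool functions in this article are synchronous, the parallel calls in a single response can be dispatched concurrently on the host side — a sketch using a thread pool (function names illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def execute_parallel_calls(tool_calls: list[tuple[str, dict]], tools: dict) -> list[str]:
    """Run several (name, args) tool calls concurrently, preserving order."""
    def run_one(call):
        name, args = call
        fn = tools.get(name)
        return fn(**args) if fn else f"Error: Unknown tool '{name}'"

    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map keeps results in the same order as the input calls
        return list(pool.map(run_one, tool_calls))

# Example with a stub weather tool:
tools = {"get_weather": lambda city: f"{city}: 18°C"}
calls = [("get_weather", {"city": "Paris"}), ("get_weather", {"city": "London"})]
print(execute_parallel_calls(calls, tools))  # ['Paris: 18°C', 'London: 18°C']
```

Each result still needs to be appended as its own tool message, matched by tool_call_id, before the next LLM call.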
Controlling Tool Selection
OpenAI’s tool_choice parameter controls when the model calls tools:
| Value | Behavior | Use Case |
|---|---|---|
| "auto" | Model decides whether to call tools | Default — let the model reason |
| "none" | Model never calls tools | Force a text-only response |
| "required" | Model must call at least one tool | Ensure tool use on every step |
| {"type": "function", "function": {"name": "..."}} | Model must call a specific tool | Force a particular retrieval path |
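For example, to pin the model to the SQL path for a metrics question, pass a specific-function tool_choice (a sketch; the surrounding agent loop is unchanged):

```python
# Hypothetical helper: build a tool_choice value that forces one named function.
def force_tool(name: str) -> dict:
    """Return a tool_choice object requiring the model to call `name`."""
    return {"type": "function", "function": {"name": name}}

# Used in the request (shown here as a comment since it needs a live API key):
# response = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=messages,
#     tools=schemas,
#     tool_choice=force_tool("query_database"),  # model must call this tool
# )
```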
Anthropic Tool Use
Anthropic’s API uses a similar pattern with different message structures. Tools are defined with input_schema, and the model returns tool_use content blocks.
Defining Tools for Claude
import anthropic
client = anthropic.Anthropic()
tools = [
{
"name": "search_knowledge_base",
"description": "Search the internal knowledge base for product documentation.",
"input_schema": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Natural language search query",
},
},
"required": ["query"],
},
},
{
"name": "query_database",
"description": "Execute a read-only SQL query against the analytics database.",
"input_schema": {
"type": "object",
"properties": {
"sql": {
"type": "string",
"description": "SQL SELECT query to execute",
},
},
"required": ["sql"],
},
},
]
The Tool Use Loop
def run_anthropic_agent(query: str, tools: list, tool_functions: dict, max_steps: int = 8) -> str:
"""Agent loop using Anthropic's tool use API."""
messages = [{"role": "user", "content": query}]
for step in range(max_steps):
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
tools=tools,
messages=messages,
)
# Check if the model wants to use tools
if response.stop_reason == "end_turn":
# Extract text content from the response
for block in response.content:
if block.type == "text":
return block.text
return ""
# Process tool use blocks
assistant_content = response.content
messages.append({"role": "assistant", "content": assistant_content})
tool_results = []
for block in assistant_content:
if block.type == "tool_use":
name = block.name
args = block.input
if name in tool_functions:
result = tool_functions[name](**args)
else:
result = f"Error: Unknown tool '{name}'"
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": str(result),
})
messages.append({"role": "user", "content": tool_results})
return "Agent reached maximum steps."
OpenAI vs. Anthropic: Schema Comparison
| Aspect | OpenAI | Anthropic |
|---|---|---|
| Schema key | parameters | input_schema |
| Tool wrapper | {"type": "function", "function": {...}} | Top-level {"name": ..., "input_schema": ...} |
| Response format | message.tool_calls list | Content blocks with type: "tool_use" |
| Result format | {"role": "tool", "tool_call_id": ...} | {"type": "tool_result", "tool_use_id": ...} |
| Stop signal | No tool_calls in response | stop_reason: "end_turn" |
| Parallel calls | Multiple tool_calls in one message | Multiple tool_use blocks in one response |
| Strict mode | strict: true for guaranteed schema | strict: true for guaranteed schema |
Despite the surface differences, the core pattern is identical: define schemas, let the model choose tools, execute them, return results.
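Because only the wrapping differs, a small adapter (illustrative) can translate an OpenAI-style schema into Anthropic's format, letting one tool definition serve both providers:

```python
def openai_to_anthropic(schema: dict) -> dict:
    """Convert an OpenAI function schema to Anthropic's tool format."""
    fn = schema["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],  # same JSON Schema, different key
    }

openai_schema = {
    "type": "function",
    "function": {
        "name": "query_database",
        "description": "Execute a read-only SQL query.",
        "parameters": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    },
}
claude_tool = openai_to_anthropic(openai_schema)
# claude_tool carries the parameters object unchanged under "input_schema"
```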
Building Retrieval Tools
A retrieval agent needs tools that connect to real data sources. Here are three essential retrieval tools — each targeting a different data modality.
Tool 1: Vector Search Tool
For unstructured knowledge — documents, articles, support tickets:
from openai import OpenAI
import numpy as np
client = OpenAI()
class VectorSearchTool:
"""Search a vector store for semantically similar documents."""
def __init__(self, index, documents, embedding_model="text-embedding-3-small"):
self.index = index # FAISS or similar index
self.documents = documents # List of document texts
self.embedding_model = embedding_model
def search(self, query: str, top_k: int = 5) -> str:
"""Search the vector store and return the top-k most relevant documents."""
# Generate query embedding
response = client.embeddings.create(
model=self.embedding_model,
input=query,
)
query_embedding = np.array(response.data[0].embedding, dtype=np.float32)
# Search the index
distances, indices = self.index.search(
query_embedding.reshape(1, -1), top_k
)
results = []
for i, (dist, idx) in enumerate(zip(distances[0], indices[0])):
if idx < len(self.documents):
results.append(f"[{i+1}] (score: {1-dist:.3f}) {self.documents[idx][:500]}")
return "\n\n".join(results) if results else "No relevant documents found."
def as_openai_schema(self) -> dict:
return {
"type": "function",
"function": {
"name": "search_knowledge_base",
"description": "Search the knowledge base for documents relevant to the query. "
"Use this for questions about products, documentation, and procedures.",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Natural language search query"},
"top_k": {"type": "integer", "description": "Number of results (default: 5)"},
},
"required": ["query"],
},
},
}
Tool 2: SQL Database Tool
For structured data — metrics, user records, transaction history:
import sqlite3
from contextlib import contextmanager
class SQLTool:
"""Execute read-only SQL queries against a database."""
def __init__(self, db_path: str, schema_description: str = ""):
self.db_path = db_path
self.schema_description = schema_description
@contextmanager
def _get_connection(self):
conn = sqlite3.connect(self.db_path)
conn.row_factory = sqlite3.Row
try:
yield conn
finally:
conn.close()
def query(self, sql: str) -> str:
"""Execute a read-only SQL query and return results as formatted text."""
# Validate: only allow SELECT statements
normalized = sql.strip().upper()
if not normalized.startswith("SELECT"):
return "Error: Only SELECT queries are allowed."
with self._get_connection() as conn:
try:
cursor = conn.execute(sql)
rows = cursor.fetchmany(50) # Limit results
if not rows:
return "Query returned no results."
columns = [desc[0] for desc in cursor.description]
result_lines = [" | ".join(columns)]
result_lines.append("-" * len(result_lines[0]))
for row in rows:
result_lines.append(" | ".join(str(v) for v in row))
more = cursor.fetchone()  # Peek: non-None if rows remain beyond the first 50
suffix = "\n... (showing first 50 rows)" if more else ""
return "\n".join(result_lines) + suffix
except sqlite3.Error as e:
return f"SQL Error: {e}"
def get_schema(self) -> str:
"""Return the database schema for the LLM to reference."""
with self._get_connection() as conn:
cursor = conn.execute(
"SELECT sql FROM sqlite_master WHERE type='table'"
)
schemas = [row[0] for row in cursor.fetchall() if row[0]]
return "\n\n".join(schemas)
def as_openai_schema(self) -> dict:
return {
"type": "function",
"function": {
"name": "query_database",
"description": f"Execute a read-only SQL SELECT query against the database. "
f"Schema: {self.schema_description}",
"parameters": {
"type": "object",
"properties": {
"sql": {"type": "string", "description": "SQL SELECT query"},
},
"required": ["sql"],
},
},
}
Tool 3: REST API Tool
For real-time data — external services, live metrics, third-party APIs:
import httpx
from urllib.parse import urljoin
class RESTAPITool:
"""Call a REST API endpoint to fetch real-time data."""
def __init__(self, base_url: str, headers: dict | None = None, allowed_paths: list | None = None):
self.base_url = base_url
self.headers = headers or {}
self.allowed_paths = allowed_paths # Whitelist of allowed endpoint patterns
def call(self, endpoint: str, method: str = "GET", params: dict | None = None) -> str:
"""Call an API endpoint and return the response."""
# Validate endpoint against allowlist
if self.allowed_paths:
if not any(endpoint.startswith(p) for p in self.allowed_paths):
return f"Error: Endpoint '{endpoint}' not in allowed paths."
url = urljoin(self.base_url, endpoint)
try:
with httpx.Client(timeout=15) as http_client:
if method.upper() == "GET":
resp = http_client.get(url, headers=self.headers, params=params)
elif method.upper() == "POST":
resp = http_client.post(url, headers=self.headers, json=params)
else:
return f"Error: Unsupported method '{method}'"
resp.raise_for_status()
return resp.text[:2000] # Truncate large responses
except httpx.HTTPStatusError as e:
return f"HTTP Error {e.response.status_code}: {e.response.text[:500]}"
except httpx.RequestError as e:
return f"Request Error: {e}"
def as_openai_schema(self) -> dict:
return {
"type": "function",
"function": {
"name": "call_rest_api",
"description": "Call an external REST API to fetch real-time data.",
"parameters": {
"type": "object",
"properties": {
"endpoint": {"type": "string", "description": "API endpoint path"},
"method": {
"type": "string",
"enum": ["GET", "POST"],
"description": "HTTP method (default: GET)",
},
},
"required": ["endpoint"],
},
},
}
Wiring Tools into an Agent
# Initialize tools
vector_tool = VectorSearchTool(index=faiss_index, documents=docs)
sql_tool = SQLTool(db_path="analytics.db", schema_description="users, orders, products tables")
api_tool = RESTAPITool(
base_url="https://api.example.com",
headers={"Authorization": "Bearer <token>"},
allowed_paths=["/status", "/users", "/metrics"],
)
# Build schemas and function registry
schemas = [
vector_tool.as_openai_schema(),
sql_tool.as_openai_schema(),
api_tool.as_openai_schema(),
]
tool_functions = {
"search_knowledge_base": vector_tool.search,
"query_database": sql_tool.query,
"call_rest_api": api_tool.call,
}
# Run the agent
answer = run_tool_calling_agent(
query="How many users signed up last month and what does our onboarding doc say about retention?",
tools=tool_functions,
schemas=schemas,
)
The agent will:
- Call query_database with a SQL query to count last month’s signups
- Call search_knowledge_base to find onboarding documentation about retention
- Synthesize both results into a single answer
Dynamic Tool Selection
How Agents Choose Tools
When an LLM receives a user query along with tool schemas, it uses three signals to decide which tool to call:
graph TD
A["User Query"] --> B["LLM evaluates<br/>query + tool schemas"]
B --> C{"Match tool<br/>descriptions"}
C --> D["Signal 1:<br/>Semantic match<br/>between query and<br/>tool description"]
C --> E["Signal 2:<br/>Parameter fit<br/>— does the query<br/>provide required args?"]
C --> F["Signal 3:<br/>System prompt<br/>routing instructions"]
D --> G["Rank tools<br/>by relevance"]
E --> G
F --> G
G --> H["Emit tool_call<br/>for best match"]
style A fill:#4a90d9,color:#fff,stroke:#333
style B fill:#9b59b6,color:#fff,stroke:#333
style D fill:#e67e22,color:#fff,stroke:#333
style E fill:#e67e22,color:#fff,stroke:#333
style F fill:#e67e22,color:#fff,stroke:#333
style H fill:#27ae60,color:#fff,stroke:#333
Writing Effective Tool Descriptions
The tool description is the most important factor in tool selection. It tells the model when and why to use each tool:
# ❌ Bad — too vague, the model can't distinguish from other search tools
{
"name": "search",
"description": "Search for information",
}
# ✅ Good — specific about data source, content type, and when to use it
{
"name": "search_support_tickets",
"description": "Search resolved support tickets for known issues, workarounds, "
"and troubleshooting steps. Use this when the user reports a bug, "
"error, or asks about known issues. Returns ticket summaries with "
"resolution steps.",
}
Tool Description Guidelines
| Guideline | Example |
|---|---|
| State the data source | “Search the PostgreSQL analytics database” |
| Describe the content | “Contains user records, orders, and product catalogs” |
| Specify when to use | “Use this for structured data questions like counts, averages, and filters” |
| Describe the output | “Returns rows as formatted text with column headers” |
| Mention limitations | “Read-only — only SELECT queries are supported” |
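One way to keep descriptions accurate is to derive the schema from the function itself, so the docstring is the description and the signature is the parameter spec. A minimal sketch using inspect (it maps only a few Python types; frameworks like LangChain's @tool do this far more thoroughly):

```python
import inspect

_PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_function(fn) -> dict:
    """Build an OpenAI-style tool schema from a function's signature and docstring."""
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": _PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the arg is required
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {"type": "object", "properties": props, "required": required},
        },
    }

def search_tickets(query: str, top_k: int = 5) -> str:
    """Search resolved support tickets for known issues and workarounds."""
    ...

schema = schema_from_function(search_tickets)
# schema["function"]["description"] comes straight from the docstring
```

Generating schemas this way means a rewritten docstring immediately improves tool selection, with no separate schema file to drift out of sync.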
Routing with System Prompts
For complex agents with many tools, add explicit routing instructions to the system prompt:
SYSTEM_PROMPT = """You are a retrieval agent with access to three data sources:
1. **knowledge_base** — Use for questions about documentation, procedures, and how-to guides.
Best for: "How do I configure X?", "What is our policy on Y?"
2. **database** — Use for questions about metrics, user data, and transactions.
Best for: "How many users signed up last week?", "What's the average order value?"
3. **rest_api** — Use for real-time data from external services.
Best for: "What's the current status of service X?", "What's the live price of Y?"
For questions that span multiple sources, call the relevant tools in sequence.
Always verify facts with tools before answering — never guess."""
Tool Registries and Discovery
As agents gain access to more tools, managing them becomes a challenge. A tool registry centralizes tool definitions and enables dynamic discovery.
Building a Tool Registry
class ToolRegistry:
"""Centralized registry for agent tools with metadata and discovery."""
def __init__(self):
self._tools: dict[str, dict] = {}
def register(self, name: str, function: callable, schema: dict, tags: list[str] | None = None):
"""Register a tool with its function, schema, and optional tags."""
self._tools[name] = {
"function": function,
"schema": schema,
"tags": tags or [],
}
def get_function(self, name: str) -> callable:
"""Get the callable function for a tool."""
return self._tools[name]["function"]
def get_schemas(self, tags: list[str] | None = None) -> list[dict]:
"""Get tool schemas, optionally filtered by tags."""
if tags is None:
return [t["schema"] for t in self._tools.values()]
return [
t["schema"] for t in self._tools.values()
if any(tag in t["tags"] for tag in tags)
]
def get_functions(self) -> dict[str, callable]:
"""Get all tool name -> function mappings."""
return {name: t["function"] for name, t in self._tools.items()}
def list_tools(self) -> list[str]:
"""List all registered tool names."""
return list(self._tools.keys())
# Usage
registry = ToolRegistry()
registry.register(
name="search_knowledge_base",
function=vector_tool.search,
schema=vector_tool.as_openai_schema(),
tags=["retrieval", "documents"],
)
registry.register(
name="query_database",
function=sql_tool.query,
schema=sql_tool.as_openai_schema(),
tags=["retrieval", "structured-data"],
)
registry.register(
name="call_rest_api",
function=api_tool.call,
schema=api_tool.as_openai_schema(),
tags=["retrieval", "real-time"],
)
# Pass filtered schemas to the agent
retrieval_schemas = registry.get_schemas(tags=["retrieval"])
retrieval_functions = registry.get_functions()
Why Registries Matter at Scale
| Tools Count | Challenge | Registry Solution |
|---|---|---|
| 1–5 | Manual management works | Simple dict/list |
| 5–20 | Hard to keep schemas in sync with implementations | Centralized registration with validation |
| 20–100 | Context window overflow — too many schemas | Tag-based filtering, send only relevant tools |
| 100+ | Neither humans nor models can manage | Dynamic discovery + tool search (MCP, Anthropic tool search) |
For agents with many tools, tool search becomes essential — the agent first searches for relevant tools, then uses the selected tools to answer the query. This two-stage pattern keeps the context window manageable.
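A minimal sketch of that two-stage pattern on top of a registry (simple keyword overlap stands in for a real embedding-based tool search):

```python
def search_tools(query: str, registry: dict[str, str], top_n: int = 3) -> list[str]:
    """Stage 1: rank tool names by keyword overlap between query and description."""
    words = set(query.lower().split())
    scored = [
        (len(words & set(desc.lower().split())), name)
        for name, desc in registry.items()
    ]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_n] if score > 0]

# Hypothetical mini-registry: tool name -> description
registry = {
    "query_database": "sql query counts averages users orders",
    "search_knowledge_base": "documentation guides procedures",
    "call_rest_api": "real-time status metrics external",
}
# Stage 1 selects candidate tools; stage 2 sends only those schemas to the LLM.
print(search_tools("how many users placed orders", registry))  # ['query_database']
```

Only the schemas of the matched tools are passed to the model, so the context cost stays flat no matter how large the registry grows.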
Model Context Protocol (MCP)
The Model Context Protocol is an open standard for connecting AI applications to external data sources and tools. Think of it as a universal adapter: instead of writing custom integrations for every tool and every AI application, you build an MCP server once and any MCP-compatible client can use it.
Architecture
graph TD
subgraph Host["MCP Host (AI Application)"]
A["LLM Agent"] --> B["MCP Client 1"]
A --> C["MCP Client 2"]
A --> D["MCP Client 3"]
end
B -->|"JSON-RPC<br/>(stdio)"| E["MCP Server A<br/>Local Filesystem"]
C -->|"JSON-RPC<br/>(stdio)"| F["MCP Server B<br/>SQL Database"]
D -->|"JSON-RPC<br/>(HTTP)"| G["MCP Server C<br/>Remote API"]
style Host fill:#F2F2F2,stroke:#D9D9D9
style A fill:#9b59b6,color:#fff,stroke:#333
style B fill:#4a90d9,color:#fff,stroke:#333
style C fill:#4a90d9,color:#fff,stroke:#333
style D fill:#4a90d9,color:#fff,stroke:#333
style E fill:#27ae60,color:#fff,stroke:#333
style F fill:#27ae60,color:#fff,stroke:#333
style G fill:#e67e22,color:#fff,stroke:#333
Key participants:
- MCP Host: The AI application (e.g., Claude Desktop, VS Code Copilot, a custom agent) that coordinates one or more MCP clients
- MCP Client: A component that maintains a dedicated connection to one MCP server
- MCP Server: A program that exposes tools, resources, and prompts to MCP clients via JSON-RPC 2.0
MCP Primitives
MCP servers expose three types of capabilities:
| Primitive | Purpose | Example |
|---|---|---|
| Tools | Executable functions the LLM can invoke | query_database(sql), search_docs(query) |
| Resources | Data sources for contextual information | File contents, database schemas, API docs |
| Prompts | Reusable interaction templates | System prompts, few-shot examples |
Tool Discovery and Execution
MCP uses JSON-RPC 2.0 for all communication. The lifecycle follows: initialize → discover tools → execute tools → handle notifications.
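Before any tools are listed, client and server negotiate capabilities in an initialize handshake (a sketch — the field values, client name, and protocol version string here are illustrative):

```json
// Client sends:
{"jsonrpc": "2.0", "id": 0, "method": "initialize", "params": {
  "protocolVersion": "2025-03-26",
  "capabilities": {},
  "clientInfo": {"name": "retrieval-agent", "version": "1.0.0"}
}}
// The server responds with its own capabilities and serverInfo, and the
// client confirms with a notification before calling tools/list:
{"jsonrpc": "2.0", "method": "notifications/initialized"}
```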
Step 1: Discover available tools
// Client sends:
{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
// Server responds:
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"tools": [
{
"name": "search_documents",
"description": "Search indexed documents by semantic similarity",
"inputSchema": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query"},
"top_k": {"type": "integer", "description": "Number of results"}
},
"required": ["query"]
}
},
{
"name": "query_sql",
"description": "Execute a read-only SQL query",
"inputSchema": {
"type": "object",
"properties": {
"sql": {"type": "string", "description": "SQL SELECT query"}
},
"required": ["sql"]
}
}
]
}
}
Step 2: Execute a tool
// Client sends:
{
"jsonrpc": "2.0",
"id": 2,
"method": "tools/call",
"params": {
"name": "query_sql",
"arguments": {"sql": "SELECT COUNT(*) FROM users WHERE created_at > '2025-05-01'"}
}
}
// Server responds:
{
"jsonrpc": "2.0",
"id": 2,
"result": {
"content": [{"type": "text", "text": "count: 1,247"}],
"isError": false
}
}
Building an MCP Server
Using the Python MCP SDK, you can expose your retrieval tools as an MCP server:
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("Retrieval Server")
@mcp.tool()
def search_knowledge_base(query: str, top_k: int = 5) -> str:
"""Search the knowledge base for documents relevant to the query.
Use this for questions about products, documentation, and procedures."""
# Connect to your vector store
results = vector_store.similarity_search(query, k=top_k)
return "\n\n".join(
f"[{i+1}] {doc.page_content[:500]}" for i, doc in enumerate(results)
)
@mcp.tool()
def query_database(sql: str) -> str:
"""Execute a read-only SQL SELECT query against the analytics database.
Contains tables: users, orders, products, events."""
if not sql.strip().upper().startswith("SELECT"):
return "Error: Only SELECT queries are allowed."
try:
result = db.execute(sql)
return format_rows(result)
except Exception as e:
return f"SQL Error: {e}"
@mcp.resource("schema://database")
def get_database_schema() -> str:
"""Provide the database schema as a resource for the LLM to reference."""
return db.get_schema()
Run the server:
# Stdio transport (local)
python retrieval_server.py
# HTTP transport (remote)
mcp run retrieval_server.py --transport http --port 8080
Connecting MCP Tools to Claude
MCP tool schemas map directly to Anthropic’s tool format — just rename inputSchema to input_schema:
from mcp import ClientSession
async def get_claude_tools(mcp_session: ClientSession) -> list[dict]:
"""Convert MCP tools to Claude's tool format."""
mcp_tools = await mcp_session.list_tools()
return [
{
"name": tool.name,
"description": tool.description or "",
"input_schema": tool.inputSchema,
}
for tool in mcp_tools.tools
]
Why MCP Matters for Retrieval Agents
| Before MCP | With MCP |
|---|---|
| Custom integration per tool per agent | Build once, connect everywhere |
| Tools tightly coupled to agent code | Tools are independent servers |
| Static tool lists defined at build time | Dynamic discovery via tools/list |
| No standard for tool metadata | Standardized schemas, capabilities, notifications |
| Hard to share tools across teams | Publish MCP servers; any MCP client can connect |
MCP is already supported by Claude, ChatGPT, VS Code Copilot, Cursor, and many other clients — making it the de facto standard for tool interoperability.
Tool Use with LangChain and LangGraph
LangChain provides a unified tool abstraction that works across LLM providers. Tools are defined once and automatically converted to the correct schema for OpenAI, Anthropic, or any supported model.
Defining Tools with @tool
from langchain_core.tools import tool
@tool
def search_knowledge_base(query: str, top_k: int = 5) -> str:
"""Search the knowledge base for documents relevant to the query.
Use this for questions about products, documentation, and procedures."""
results = vector_store.similarity_search(query, k=top_k)
return "\n\n".join(doc.page_content[:500] for doc in results)
@tool
def query_database(sql: str) -> str:
"""Execute a read-only SQL query against the analytics database.
Schema: users(id, name, email, created_at), orders(id, user_id, amount, date)."""
if not sql.strip().upper().startswith("SELECT"):
return "Error: Only SELECT queries are allowed."
return db.execute_and_format(sql)
@tool
def call_api(endpoint: str, method: str = "GET") -> str:
"""Call a REST API endpoint to fetch real-time data.
Allowed endpoints: /status, /users/{id}, /metrics."""
return api_client.call(endpoint, method)
Binding Tools to a Model
LangChain’s bind_tools converts tool definitions to the provider-specific format automatically:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
tools = [search_knowledge_base, query_database, call_api]
# bind_tools converts @tool definitions to OpenAI function schemas
llm_with_tools = llm.bind_tools(tools)
# The model now returns AIMessage with tool_calls when appropriate
response = llm_with_tools.invoke("How many users signed up last week?")
print(response.tool_calls)
# [{'name': 'query_database', 'args': {'sql': "SELECT COUNT(*) FROM users WHERE ..."}, 'id': '...'}]
Building a ReAct Agent with Tools
from langgraph.prebuilt import create_react_agent
agent = create_react_agent(
model=llm,
tools=tools,
prompt="You are a retrieval agent. Always use tools to verify facts. "
"For data questions use query_database, for documentation use "
"search_knowledge_base, for live data use call_api.",
)
result = agent.invoke({
"messages": [{"role": "user", "content": "How many orders were placed yesterday "
"and what does our refund policy say?"}]
})
Custom Tool Classes
For tools that need initialization or state, use the BaseTool class:
from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field
class SQLQueryInput(BaseModel):
sql: str = Field(description="SQL SELECT query to execute")
class SQLQueryTool(BaseTool):
name: str = "query_database"
description: str = "Execute a read-only SQL query against the analytics database."
args_schema: type[BaseModel] = SQLQueryInput
db_path: str
def _run(self, sql: str) -> str:
if not sql.strip().upper().startswith("SELECT"):
return "Error: Only SELECT queries are allowed."
conn = sqlite3.connect(self.db_path)
try:
cursor = conn.execute(sql)
rows = cursor.fetchmany(50)
columns = [desc[0] for desc in cursor.description]
lines = [" | ".join(columns)]
for row in rows:
lines.append(" | ".join(str(v) for v in row))
return "\n".join(lines)
except sqlite3.Error as e:
return f"SQL Error: {e}"
finally:
conn.close()
# Usage
sql_tool = SQLQueryTool(db_path="analytics.db")
agent = create_react_agent(model=llm, tools=[sql_tool, search_knowledge_base])
Tool Use with LlamaIndex
LlamaIndex offers two primary tool types: FunctionTool for wrapping arbitrary Python functions, and QueryEngineTool for wrapping RAG query engines.
FunctionTool
from llama_index.core.tools import FunctionTool
def search_knowledge_base(query: str, top_k: int = 5) -> str:
"""Search the knowledge base for documents relevant to the query."""
results = vector_store.similarity_search(query, k=top_k)
return "\n\n".join(doc.page_content[:500] for doc in results)
def query_database(sql: str) -> str:
"""Execute a read-only SQL query against the analytics database."""
if not sql.strip().upper().startswith("SELECT"):
return "Error: Only SELECT queries are allowed."
return db.execute_and_format(sql)
# Wrap as FunctionTools — metadata is extracted from docstrings
search_tool = FunctionTool.from_defaults(fn=search_knowledge_base)
sql_tool = FunctionTool.from_defaults(fn=query_database)
QueryEngineTool
The real power of LlamaIndex: wrap a full RAG pipeline as a tool that an agent can call:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool

# Build a RAG index
documents = SimpleDirectoryReader("./knowledge_base").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=5)

# Wrap as a tool
rag_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="knowledge_base",
    description="Search the internal knowledge base for product documentation, "
    "API references, and how-to guides. Use this for questions about "
    "our products and procedures.",
)

Building an Agent with Multiple Tools
from llama_index.llms.openai import OpenAI
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context

llm = OpenAI(model="gpt-4o-mini", temperature=0)

agent = ReActAgent(
    tools=[rag_tool, sql_tool, search_tool],
    llm=llm,
    system_prompt="You are a retrieval agent with access to a knowledge base, "
    "a SQL database, and a document search tool. Always verify facts "
    "with tools. Route questions to the most appropriate tool.",
)

ctx = Context(agent)
response = await agent.run(
    "What's our API rate limit and how many requests can 500 users "
    "make per hour if each user is limited to 100 requests per minute?",
    ctx=ctx,
)

ToolSpec for Third-Party Integrations
LlamaIndex’s BaseToolSpec lets you wrap entire third-party APIs as tool collections:
from llama_index.core.tools.tool_spec.base import BaseToolSpec

class AnalyticsToolSpec(BaseToolSpec):
    """Tools for querying the analytics platform."""

    spec_functions = ["get_user_count", "get_revenue", "get_top_products"]

    def __init__(self, api_client):
        self.client = api_client

    def get_user_count(self, start_date: str, end_date: str) -> str:
        """Get the number of active users between two dates.
        Dates should be in YYYY-MM-DD format."""
        result = self.client.get(f"/users/count?start={start_date}&end={end_date}")
        return f"Active users: {result['count']}"

    def get_revenue(self, period: str = "monthly") -> str:
        """Get revenue data. Period can be 'daily', 'weekly', or 'monthly'."""
        result = self.client.get(f"/revenue?period={period}")
        return f"Revenue ({period}): ${result['total']:,.2f}"

    def get_top_products(self, limit: int = 10) -> str:
        """Get the top-selling products by revenue."""
        result = self.client.get(f"/products/top?limit={limit}")
        lines = [f"{p['name']}: ${p['revenue']:,.2f}" for p in result["products"]]
        return "\n".join(lines)

# Convert to tools and add to agent
analytics_spec = AnalyticsToolSpec(api_client=analytics_api)
analytics_tools = analytics_spec.to_tool_list()
agent = ReActAgent(
    tools=[rag_tool] + analytics_tools,
    llm=llm,
)

LangChain vs. LlamaIndex Tool Comparison
| Feature | LangChain / LangGraph | LlamaIndex |
|---|---|---|
| Tool definition | `@tool` decorator or `BaseTool` class | `FunctionTool.from_defaults()` |
| RAG-as-tool | Wrap retriever in a `@tool` function | `QueryEngineTool` — first-class RAG tool |
| Schema generation | Auto-generated from type hints + docstring | Auto-generated from function signature |
| Tool binding | `llm.bind_tools(tools)` — provider-agnostic | Passed to agent constructor |
| Third-party integrations | LangChain tool packages (`langchain-community`) | `BaseToolSpec` + LlamaHub tools |
| Calling convention | Native function calling via `bind_tools` | Text-parsed ReAct (Thought/Action/Observation) |
| Agent construction | `create_react_agent(model, tools)` | `ReActAgent(tools=[], llm=llm)` |
| Best for | Multi-provider tool calling, complex workflows | RAG-centric agents, rapid prototyping |
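Both frameworks generate the JSON schema the model sees from the Python function itself. As a rough illustration of what that auto-generation involves (a simplified sketch using only the standard library, not either framework's actual implementation), a type-hinted signature can be mapped to a JSON-Schema-style dict with `inspect` and `typing`:

```python
import inspect
from typing import get_type_hints

# Simplified sketch of schema auto-generation from a function signature.
# Real frameworks (LangChain @tool, LlamaIndex FunctionTool.from_defaults)
# handle many more cases; this maps only basic Python types.
_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def build_schema(fn) -> dict:
    sig = inspect.signature(fn)
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        properties[name] = {"type": _TYPE_MAP.get(hints.get(name, str), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the model must supply it
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def search_knowledge_base(query: str, top_k: int = 5) -> str:
    """Search the knowledge base for documents relevant to the query."""
    ...

schema = build_schema(search_knowledge_base)
# schema["parameters"]["required"] == ["query"]  (top_k has a default)
```

This is why keeping type hints and docstrings accurate matters: they are the schema, and any drift between signature and description propagates straight into the model's view of the tool.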
Common Pitfalls and How to Fix Them
| Pitfall | Symptom | Fix |
|---|---|---|
| Vague tool descriptions | Agent picks the wrong tool | Be specific: state data source, content type, when to use |
| Missing parameter descriptions | Agent passes wrong arguments | Add clear description to every parameter in the schema |
| No input validation | SQL injection, path traversal | Validate inputs at the tool level — whitelist queries, sanitize paths |
| Unbounded output | Tool returns 10MB of data, blows context window | Truncate results, add `max_results` parameters |
| Tool not found | Agent hallucinates tool names | Return clear error with list of available tools |
| No error handling | Agent crashes on tool failure | Wrap tool calls in try/except, return error strings |
| Too many tools | Model confused, slow, expensive | Use tool registries with tag filtering or tool search |
| Schema drift | Tool function signature changes but schema doesn’t | Generate schemas from code (LangChain `@tool`, LlamaIndex `from_defaults`) |
| No rate limiting | Agent calls expensive APIs in a loop | Add rate limits and cost budgets at the tool level |
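Several of these fixes can live in one place: a thin dispatch layer between the agent loop and the tool functions. The sketch below (hypothetical helper, not tied to any framework) handles three pitfalls from the table at once: it truncates oversized output, catches exceptions instead of crashing, and answers a hallucinated tool name with a corrective message listing the real options:

```python
# Hypothetical defensive dispatch layer around plain-Python tools.
MAX_OUTPUT_CHARS = 4_000

def safe_dispatch(tools: dict, name: str, args: dict) -> str:
    if name not in tools:
        # Tool not found: steer the model back with the available options
        return f"Error: unknown tool '{name}'. Available tools: {', '.join(sorted(tools))}"
    try:
        result = str(tools[name](**args))
    except Exception as e:  # never let a tool failure crash the agent loop
        return f"Error running '{name}': {e}"
    if len(result) > MAX_OUTPUT_CHARS:  # keep the context window safe
        return result[:MAX_OUTPUT_CHARS] + f"\n[truncated {len(result) - MAX_OUTPUT_CHARS} chars]"
    return result

# Usage with toy tools
tools = {"echo": lambda text: text, "boom": lambda: 1 / 0}
print(safe_dispatch(tools, "echo", {"text": "hi"}))  # hi
print(safe_dispatch(tools, "boom", {}))              # Error running 'boom': division by zero
print(safe_dispatch(tools, "search", {}))            # Error: unknown tool 'search'. Available tools: boom, echo
```

Returning error strings rather than raising is deliberate: the model can read the message and self-correct on the next turn, which is the behavior the ReAct loop depends on.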
Conclusion
Tool use is the bridge between an LLM’s reasoning capabilities and the outside world. The ecosystem has converged on a clear stack: structured function calling at the provider layer, framework-level tool abstractions on top, and MCP for cross-application interoperability.
Key takeaways:
- Structured function calling (OpenAI’s `tools` parameter, Anthropic’s `input_schema`) replaces fragile text parsing with guaranteed JSON schema conformance. Use it for all production agents.
- Tool descriptions are critical — they are the primary signal the LLM uses to select tools. Be specific about data sources, content types, when to use each tool, and what it returns.
- Build three types of retrieval tools: vector search for unstructured documents, SQL for structured data, REST APIs for real-time external data. Together they cover most retrieval needs.
- Tool registries centralize tool management and enable tag-based filtering when the tool count grows beyond what fits in a single context window.
- MCP standardizes tool interoperability — build an MCP server once, and any compatible AI application can discover and use your tools via JSON-RPC. It’s the emerging “USB-C for AI tools.”
- LangChain provides `@tool` decorators and `bind_tools` for cross-provider compatibility. LlamaIndex excels with `QueryEngineTool` for RAG-centric agents and `BaseToolSpec` for wrapping entire APIs.
- Dynamic tool selection depends on good descriptions, well-designed schemas, and routing instructions in the system prompt. For 100+ tools, use tool search to narrow the set before the LLM sees them.
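The registry idea above fits in a few lines of plain Python. This is an illustrative sketch (the `ToolRegistry` class and its methods are hypothetical, not a library API): each tool carries a tag set, and only the matching subset is handed to the agent:

```python
# Hypothetical tag-filtered tool registry (illustrative, not a library API).
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolEntry:
    fn: Callable
    description: str
    tags: set[str] = field(default_factory=set)

class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, ToolEntry] = {}

    def register(self, name: str, fn: Callable, description: str, tags: set[str]):
        self._tools[name] = ToolEntry(fn, description, tags)

    def select(self, *tags: str) -> dict[str, ToolEntry]:
        """Return only the tools whose tag set intersects the requested tags."""
        return {n: t for n, t in self._tools.items() if t.tags & set(tags)}

registry = ToolRegistry()
registry.register("query_database", lambda sql: "...", "Run read-only SQL.", {"sql", "analytics"})
registry.register("search_docs", lambda q: "...", "Vector search over docs.", {"vector", "docs"})
registry.register("get_revenue", lambda period="monthly": "...", "Revenue API.", {"api", "analytics"})

analytics_tools = registry.select("analytics")
# only the tools tagged "analytics": query_database and get_revenue
```

The same pattern scales to embedding-based tool search: replace the tag intersection in `select` with a similarity lookup over the tool descriptions.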
Start with a few well-described tools, verify the agent routes correctly, then scale to registries and MCP as your tool count grows.
References
- OpenAI, Function Calling and Other API Updates, June 2023 — introduction of structured function calling for LLMs.
- Anthropic, Tool Use Documentation, 2024 — `input_schema`/`tool_result` content blocks for structured tool calling.
- Anthropic, Model Context Protocol (MCP) Specification, 2025 — open standard for tool discovery and execution via JSON-RPC.
- LangChain, LangChain Tools Documentation, 2024 — unified `@tool` decorator and `bind_tools` for cross-provider compatibility.
- LlamaIndex, FunctionTool and QueryEngineTool, 2024 — RAG-centric tool abstractions and `BaseToolSpec` for API wrapping.
Read More
- Understand the ReAct reasoning loop that powers tool selection in Building a ReAct Agent from Scratch.
- Connect vector search, SQL, and API tools to RAG pipelines with Building a RAG Pipeline from Scratch.
- Add safety validation to tool inputs and agent outputs with Guardrails for LLM Applications.
- Monitor tool calls and agent behavior in production with Observability for Multi-Turn LLM Conversations.
- See how agents route queries across multiple retrieval sources in Agentic RAG: When Retrieval Needs Reasoning.